Here we assess the activity of kākā in the Orokonui Ecosanctuary
using hidden Markov models (HMMs). We use the momentuHMM
package to fit the HMMs. For activity we use the overall dynamic body
acceleration (ODBA) as a proxy. ODBA is a measure of acceleration that
is calculated from the tri-axial accelerometer data.
The ungeviz package is used to create short vertical line segments to create the actograms using ggplot.
# devtools::install_github("wilkelab/ungeviz")
library(tidyverse)
packages <- c("ggplot2", "lubridate", "magrittr", "forcats", "devtools", "momentuHMM", "ungeviz", "ggpubr", "oce", "tictoc", "amt", "terra")
walk(packages, require, character.only = T)
activity_files <- paste0("data/", list.files(path = "data", pattern = "2021")) # writes .csv file names to chr vector - 2021 is common across all the datasets of interest
all_activity <- map(activity_files, read.csv, skip = 4) # reads csv files into list of dataframes
The tags are the identifiers for each GPS device, which we used as the ID. The GMT time is the time the GPS device recorded the data, which we converted to New Zealand time. The temperature column was renamed to be more descriptive, and the original column was removed. We did not use the temperature column for any analyses.
tags <- c("A05","A06","A07","A08","A09","A10","A11","A12","A13","A14")
for (i in 1:length(all_activity)) {
all_activity[[i]]$ID <- tags[i] # add ID column
all_activity[[i]]$DateTimeGMT <- as.POSIXct(all_activity[[i]]$GMT.Time, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC") # create POSIXct time
all_activity[[i]]$DateTimeNZ <- with_tz(all_activity[[i]]$DateTimeGMT, "Pacific/Auckland")
all_activity[[i]]$Temp <- all_activity[[i]]$Temperature..C. # rename temperature
all_activity[[i]]$Temperature..C. <- NULL # remove untidy column
}
names(all_activity) <- tags
all_activity_df <- bind_rows(all_activity, .id = "ID") # bind all activity data into one dataframe
To assess the distributions of the data, we plotted histograms of the ODBA and log(ODBA). Here we are looking at whether the data looks like it separates out into clear states. The log of ODBA helps to assess the states that might be present in the data when there is a large range of values and a right-skewed distribution.
It appears there are quite clearly two distinct activity states, one with a higher ODBA and one with a lower ODBA. There may be a third state, but it is not as clear.
for(i in 1:length(all_activity)) {
# histogram of ODBA
print(all_activity[[i]] |> ggplot(aes(ODBA)) +
geom_histogram(binwidth = 5, alpha = 1, fill = "orange") +
ggtitle(paste0("Kākā ID: ", tags[i])) +
scale_x_continuous("Overall Dynamic Body Acceleration (ODBA)") +
scale_y_continuous("Frequency") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/histogram_ODBA_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
}
The log of ODBA also indicates a possible 3rd state, which is clearer for some individuals than others.
for(i in 1:length(all_activity)) {
# histogram of log(ODBA)
print(all_activity[[i]] |> ggplot(aes(log(ODBA))) +
geom_histogram(binwidth = 0.025, alpha = 1, fill = "orange") +
ggtitle(paste0("Kākā ID: ", tags[i])) +
scale_x_continuous(expression("log of Overall Dynamic Body Acceleration (ODBA"[log]*")")) +
scale_y_continuous("Frequency") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/histogram_logODBA_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
}
Here we put the data in an object that can be used in the HMM. We
used the prepData function from the momentuHMM
package. The prepData function takes a data frame and a
vector of column names for the coordinates, although as we don’t have
coordinates we ignore this argument. We used the prepData
function to create a list of data frames, one for each individual.
# preparing as a single data frame
all_prepped <- prepData(data = all_activity_df, coordNames = NULL)
head(all_prepped, 10)
## ID GMT.Time ODBA DateTimeGMT DateTimeNZ Temp
## 1 A05 22/09/2020 1:18:00 am 246 2020-09-22 01:18:00 2020-09-22 13:18:00 20.0
## 2 A05 22/09/2020 1:19:00 am 146 2020-09-22 01:19:00 2020-09-22 13:19:00 19.5
## 3 A05 22/09/2020 1:20:00 am 140 2020-09-22 01:20:00 2020-09-22 13:20:00 19.5
## 4 A05 22/09/2020 1:21:00 am 128 2020-09-22 01:21:00 2020-09-22 13:21:00 19.0
## 5 A05 22/09/2020 1:22:00 am 37 2020-09-22 01:22:00 2020-09-22 13:22:00 19.0
## 6 A05 22/09/2020 1:23:00 am 105 2020-09-22 01:23:00 2020-09-22 13:23:00 19.0
## 7 A05 22/09/2020 1:24:00 am 336 2020-09-22 01:24:00 2020-09-22 13:24:00 19.0
## 8 A05 22/09/2020 1:25:00 am 404 2020-09-22 01:25:00 2020-09-22 13:25:00 19.0
## 9 A05 22/09/2020 1:26:00 am 246 2020-09-22 01:26:00 2020-09-22 13:26:00 19.0
## 10 A05 22/09/2020 1:27:00 am 418 2020-09-22 01:27:00 2020-09-22 13:27:00 18.5
# preparing as a list
all_prepped <- vector(mode = "list", length = length(all_activity))
for (i in 1:length(all_activity)) {
all_prepped[[i]] <- prepData(data = all_activity[[i]], coordNames = NULL)
}
We used the fitHMM function to create a list of HMMs,
one for each individual as we looped over the data frames. We then used
the plot function to plot the HMMs. We then used the
viterbi function to fit the viterbi algorithm to the HMMs,
which assigns state labels to each ODBA value. We then used the
table function to calculate the proportion of time spent in
each state.
all_hmms <- vector(mode = "list", length = length(all_activity))
for(i in 1:length(all_activity)) {
tic()
all_hmms[[i]] <- fitHMM(all_prepped[[i]],
nbStates = 2,
dist = dist,
Par0 = Par0,
stateNames = stateNames)
toc()
plot(all_hmms[[i]], plotCI = T, breaks = 50, ask = FALSE)
states <- viterbi(all_hmms[[i]]) # fit viterbi algorithm
table(states)/nrow(all_prepped[[i]]) # proportion of time spent in states
all_prepped[[i]]$states <- as.factor(states) # add states column to DF
}
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 49.01 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 82.71 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 87.49 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 53.97 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 29.59 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 31.56 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 60.58 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 57.43 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 42.39 sec elapsed
## Decoding state sequence... DONE
## =======================================================================
## Fitting HMM with 2 states and 1 data stream
## -----------------------------------------------------------------------
## ODBA ~ weibull(shape=~1, scale=~1)
##
## Transition probability matrix formula: ~1
##
## Initial distribution formula: ~1
## =======================================================================
## DONE
## 26.16 sec elapsed
## Decoding state sequence... DONE
all_prepped_df <- bind_rows(all_prepped) # bind all activity data into one dataframe
all_prepped_df_nested <- all_prepped_df |> group_by(ID) |> nest()
all_prepped_df_nested$sex <- c("m", "m", "m", "f", "f", "f", "f", "m", "m", "f")
all_prepped_df_nested$age <- c(1, 10, 5, 1, 3, 2, 2, 3, 10, 8)
all_prepped_df_nested$origin <- c("o", "o", "c", "o", "o", "o", "o", "c", "c","o")
# estimate the sun's azimuth and angle from the latitude, longitude and time
sun_df <- data.frame(oce::sunAngle(all_prepped_df$DateTimeNZ, latitude = -45.77813, longitude = 170.604312))
all_prepped_df <- all_prepped_df_nested |> unnest(cols = c(data)) |> ungroup() |>
mutate(ID = factor(ID),
id_numeric = as.numeric(ID),
states_chr = factor(states, levels = c("1", "2"), labels = c("inactive", "active")),
sex = factor(sex),
origin = factor(origin),
year = lubridate::year(DateTimeNZ),
month = lubridate::month(DateTimeNZ),
month_chr = lubridate::month(DateTimeNZ, label = TRUE, abbr = FALSE),
month_year = lubridate::floor_date(DateTimeNZ, unit = "month"),
day = lubridate::day(DateTimeNZ),
yday = as.numeric(round(difftime(DateTimeNZ, min(DateTimeNZ), units = "days"))),
time = hms::as_hms(DateTimeNZ),
time_no_dlt_savs = hms::as_hms(DateTimeGMT + 43200), # add on 12 hours in seconds to convert to NZ time without daylight savings
sun_azimuth = sun_df$azimuth,
sun_altitude = sun_df$altitude)
head(all_prepped_df, 10)
## # A tibble: 10 × 22
## ID GMT.Time ODBA DateTimeGMT DateTimeNZ Temp states
## <fct> <chr> <int> <dttm> <dttm> <dbl> <fct>
## 1 A05 22/09/2020 … 246 2020-09-22 01:18:00 2020-09-22 13:18:00 20 2
## 2 A05 22/09/2020 … 146 2020-09-22 01:19:00 2020-09-22 13:19:00 19.5 2
## 3 A05 22/09/2020 … 140 2020-09-22 01:20:00 2020-09-22 13:20:00 19.5 2
## 4 A05 22/09/2020 … 128 2020-09-22 01:21:00 2020-09-22 13:21:00 19 2
## 5 A05 22/09/2020 … 37 2020-09-22 01:22:00 2020-09-22 13:22:00 19 2
## 6 A05 22/09/2020 … 105 2020-09-22 01:23:00 2020-09-22 13:23:00 19 2
## 7 A05 22/09/2020 … 336 2020-09-22 01:24:00 2020-09-22 13:24:00 19 2
## 8 A05 22/09/2020 … 404 2020-09-22 01:25:00 2020-09-22 13:25:00 19 2
## 9 A05 22/09/2020 … 246 2020-09-22 01:26:00 2020-09-22 13:26:00 19 2
## 10 A05 22/09/2020 … 418 2020-09-22 01:27:00 2020-09-22 13:27:00 18.5 2
## # ℹ 15 more variables: sex <fct>, age <dbl>, origin <fct>, id_numeric <dbl>,
## # states_chr <fct>, year <dbl>, month <dbl>, month_chr <ord>,
## # month_year <dttm>, day <int>, yday <dbl>, time <time>,
## # time_no_dlt_savs <time>, sun_azimuth <dbl>, sun_altitude <dbl>
# uncomment to save the data frame
# write_csv(all_prepped_df, paste0("outputs/data_files/all_prepped_df_", Sys.Date(), ".csv"))
i = 8
all_prepped_df |> dplyr::filter(ID == tags[i] & month == 9 & day == 1) |>
ggplot() +
geom_point(aes(x = DateTimeNZ, y = ODBA), size = 0.75, alpha = 0.5) +
geom_hline(yintercept = 0, linetype = "dashed") +
scale_y_continuous() +
# scale_colour_viridis_d() +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16)) # Increase legend text size
# ggsave(paste0("outputs/plots/scatter_singleday_ODBA_id_12_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
all_prepped_df |> dplyr::filter(ID == tags[i] & month == 9 & day == 1) |>
ggplot() +
geom_point(aes(x = DateTimeNZ, y = ODBA, colour = states), size = 1.5, alpha = 0.75) +
geom_hline(yintercept = 0, linetype = "dashed") +
scale_y_continuous() +
scale_colour_manual(values = c("orange", "skyblue")) +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16)) # Increase legend text size
# uncomment to save the plot
# ggsave(paste0("outputs/plots/scatter_singleday_states_ODBA_id_12_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
Here we estimate the time each individual spent in each state to determine activity budgets.
prop <- vector(mode = "list", length = length(all_activity))
for (i in 1:length(all_activity)) {
prop[[i]] <- all_prepped[[i]] |> group_by(states) |> summarise(n = n()) %$% n
}
names(prop) <- tags
prop_df <- data.frame(do.call(rbind, prop)) |> mutate(id = tags)
colnames(prop_df) <- c("inactive", "active", "id")
prop_df <- prop_df |> mutate(prop_inactive = inactive/(inactive + active), prop_active = active/(inactive + active))
# uncomment to save the data frame
# write_csv(prop_df, paste0("outputs/data_files/minutes_active_inactive_", Sys.Date(), ".csv"))
prop_df_long <- prop_df |> pivot_longer(cols = !"id", values_to = "value")
Showing a single day for one individual, to illustrate the actogram concept. We represent the two dimensions of time (x-axis) and ODBA (y-axis) by plotting the time of day on the x-axis and ODBA as the colour. This allows us to stack days on top of one another and assess behavioural changes over longer time periods.
i = 8
all_prepped_df %>% filter(ID == tags[i] & month == 9 & day == 1) %>% #
ggplot(aes(x = DateTimeNZ, y = day)) +
geom_vpline(aes(colour = log(ODBA)), height = 1) +
# scale_colour_gradient(low = "dark blue", high = "white") +
scale_colour_viridis_c() +
scale_x_datetime("Time", date_breaks = "4 hours", date_labels = "%H:%M") +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_classic() +
theme(legend.position = "none")
## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
# uncomment to save the plot
# ggsave(paste0("outputs/plots/actogram_singleday_logODBA_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
Showing all days for all individuals - month by month. I have commented out this code as the full time period is more informative, but a month by month actogram may also be useful.
# for(i in 1:length(tags)) {
#
# print(all_prepped_df %>% filter(ID == tags[i] & month %in% c(9:12, 1:2)) %>% #
# ggplot(aes(x = time, y = day)) +
# geom_vpline(aes(colour = log(ODBA)), height = 1) +
# # scale_colour_gradient(low = "dark blue", high = "white") +
# scale_colour_viridis_c() +
# scale_y_reverse() +
# facet_wrap(~ month_year) +
# ggtitle(paste0("Kākā ID: ", tags[i])) +
# theme_bw() +
# theme(legend.position = "none"))
#
# # uncomment to save the plot
# # ggsave(paste0("outputs/plots/actogram_logODBA_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
#
# }
I filtered less than 175 days as I recovered some tags and charged them, and the activity represents the time since the tag was recovered. I also coloured by log(ODBA) due to the right-skew of the data.
for(i in 1:length(tags)) {
print(all_prepped_df %>% filter(ID == tags[i] & yday < 175) %>% # & month %in% c(9:12, 1:2)
ggplot(aes(x = time_no_dlt_savs, y = yday)) +
geom_vpline(aes(colour = log(ODBA)), height = 1) +
scale_colour_viridis_c() +
# scale_colour_gradient(low = "dark blue", high = "white") +
scale_y_reverse("Day since start of study (27th August 2020)", limits = c(175, 0), breaks = seq(180, 0, -30)) +
scale_x_time("Time (without daylight savings)", breaks = seq(0, 86400, 14400)) +
# facet_wrap(~ month_year) +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_bw() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/actogram_logODBA_full_period_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
}
It is very clear in the plots above that there are patterns that relate to the time of day, and correspond to changing day lengths. When we plot the time of nautical sunrise (when the angle of the sun is less than -12 degrees below the horizon, relating to the time when the sky is no longer completely dark - i.e. first light) and sunset (when the angle of the sun is 0) we can see that the kākā are mostly active during the day and mostly inactive at night. However, when daylengths are short there is a higher proportion of night time activity, and when day lengths are long there is a higher proportion of day time inactivity.
These patterns prompted us to assess how long each kākā is active each day despite changing day lengths.
An interesting pattern is kākā ID 11, which we suspect was nesting. GPS locations also failed much of the time during the mostly inactive period, which is consistent with being in a tree hollow nesting.
for(i in 1:length(tags)) {
print(all_prepped_df %>% filter(ID == tags[i] & yday < 175) %>% # & month %in% c(9:12, 1:2)
ggplot(aes(x = time_no_dlt_savs, y = yday)) +
geom_vpline(aes(colour = log(ODBA)), height = 1) +
scale_colour_viridis_c() +
# scale_colour_gradient(low = "dark blue", high = "white") +
geom_point(data = all_prepped_df %>% filter(sun_altitude > -0.065 & sun_altitude < 0.065),
aes(x = time_no_dlt_savs, y = yday), colour = "red", size = 0.5) +
geom_point(data = all_prepped_df %>% filter(sun_altitude < -11.925 & sun_altitude > -12.05),
aes(x = time_no_dlt_savs, y = yday), colour = "black", size = 0.5) +
scale_y_reverse("Day since start of study (27th August 2020)", limits = c(175, 0), breaks = seq(180, 0, -30)) +
scale_x_time("Time (without daylight savings)", breaks = seq(0, 86400, 14400)) +
# facet_wrap(~ month_year) +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_bw() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/actogram_full_period_sun_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
}
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Warning: Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
## Warning: Removed 9 rows containing missing values (`geom_point()`).
## Removed 10 rows containing missing values (`geom_point()`).
This may be informative for other studies, so I have left the commented-out code here.
# for(i in 1:length(tags)) {
#
# print(all_prepped_df %>% filter(ID == tags[i] & month %in% c(9:12, 1:2)) %>% #
# ggplot(aes(x = time, y = day)) +
# geom_vpline(aes(colour = states), height = 1) +
# # scale_colour_gradient(low = "dark blue", high = "white") +
# scale_colour_viridis_d() +
# scale_y_reverse() +
# facet_wrap(~ month_year) +
# ggtitle(paste0("Kākā ID: ", tags[i])) +
# theme_bw() +
# theme(legend.position = "none"))
#
# }
These plots show the same as the actograms above, although the colouration is by states rather than by the magnitude of ODBA.
for(i in 1:length(tags)) {
print(all_prepped_df %>% filter(ID == tags[i] & yday < 175) %>% # & month %in% c(9:12, 1:2)
ggplot(aes(x = time_no_dlt_savs, y = yday)) +
geom_vpline(aes(colour = states), height = 1) +
# scale_colour_gradient(low = "dark blue", high = "white") +
# scale_colour_viridis_d() +
scale_colour_manual(values = c("orange", "skyblue")) +
scale_y_reverse(limits = c(175, 0)) +
scale_x_time("Time (without daylight savings)", breaks = seq(0, 86400, 14400)) +
# facet_wrap(~ month_year) +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_bw() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/actogram_states_full_period_id_", tags[i], "_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
}
This plot shows the difference between two days that have a large difference in day length. When the day length is short, this kākā had an active period from roughly 2am - 4am. When the day length is long, this kākā was not active at night, but had an inactive period during the day.
i = 8 # select individual - ID 45512
sept_day_id12 <- all_prepped_df %>% filter(ID == tags[i] & month == 9 & day == 1)
# unique(diff(sign(jan_day_id12$sun_altitude))) # this function finds when the sun altitude changes sign (i.e. sunrise and sunset)
sept_plot <- sept_day_id12 %>%
ggplot() +
# adding a rectangle for nautical twilight
geom_rect(aes(xmin = DateTimeNZ[which(diff(sign(sun_altitude - -12)) > 0)],
xmax = DateTimeNZ[which(diff(sign(sun_altitude)) > 0)], ymin = -Inf, ymax = Inf), fill = "skyblue", alpha = 0.01) +
geom_rect(aes(xmin = DateTimeNZ[which(diff(sign(sun_altitude - -12)) < 0)],
xmax = DateTimeNZ[which(diff(sign(sun_altitude)) < 0)], ymin = -Inf, ymax = Inf), fill = "skyblue", alpha = 0.01) +
geom_point(aes(x = DateTimeNZ, y = ODBA, colour = states), size = 0.75, alpha = 0.5) +
geom_line(aes(x = DateTimeNZ, y = 3*sun_altitude)) +
geom_hline(yintercept = 0, linetype = "dashed") +
# vertical line at time when sun rises
geom_vline(xintercept = sept_day_id12$DateTimeNZ[which(diff(sign(sept_day_id12$sun_altitude)) > 0)], linetype = "dashed") +
# vertical line at time when sun sets
geom_vline(xintercept = sept_day_id12$DateTimeNZ[which(diff(sign(sept_day_id12$sun_altitude)) < 0)], linetype = "dashed") +
scale_y_continuous(limits = c(min(3*sept_day_id12$sun_altitude), 750)) +
# scale_colour_viridis_d() +
scale_colour_manual(values = c("orange", "skyblue")) +
ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
sept_plot
## Warning: Removed 1 rows containing missing values (`geom_point()`).
jan_day_id12 <- all_prepped_df %>% filter(ID == tags[i] & month == 1 & day == 1)
jan_plot <- jan_day_id12 %>%
ggplot() +
# adding a rectangle for nautical twilight
geom_rect(aes(xmin = DateTimeNZ[which(diff(sign(sun_altitude - -12)) > 0)],
xmax = DateTimeNZ[which(diff(sign(sun_altitude)) > 0)], ymin = -Inf, ymax = Inf), fill = "skyblue", alpha = 0.01) +
geom_rect(aes(xmin = DateTimeNZ[which(diff(sign(sun_altitude - -12)) < 0)],
xmax = DateTimeNZ[which(diff(sign(sun_altitude)) < 0)], ymin = -Inf, ymax = Inf), fill = "skyblue", alpha = 0.01) +
geom_point(aes(x = DateTimeNZ, y = ODBA, colour = states), size = 0.75, alpha = 0.5) +
geom_line(aes(x = DateTimeNZ, y = 3*sun_altitude)) +
geom_hline(yintercept = 0, linetype = "dashed") +
# vertical line at time when sun rises
geom_vline(xintercept = jan_day_id12$DateTimeNZ[which(diff(sign(jan_day_id12$sun_altitude)) > 0)], linetype = "dashed") +
# vertical line at time when sun sets
geom_vline(xintercept = jan_day_id12$DateTimeNZ[which(diff(sign(jan_day_id12$sun_altitude)) < 0)], linetype = "dashed") +
scale_y_continuous(limits = c(min(3*sept_day_id12$sun_altitude), 750)) +
# scale_colour_viridis_d() +
scale_colour_manual(values = c("orange", "skyblue")) +
# ggtitle(paste0("Kākā ID: ", tags[i])) +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
jan_plot
ggarrange(sept_plot + rremove("xlab"), jan_plot, ncol = 1, nrow = 2)
## Warning: Removed 1 rows containing missing values (`geom_point()`).
# uncomment to save the plot
# ggsave(paste0("outputs/plots/HMM_45512_Sept_Jan_ggarranged_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
Here we calculated the active time per day for each individual.
active_time_day_id <- all_prepped_df |>
group_by(ID, yday) |>
summarise(inactive_time = sum(states == 1), active_time = sum(states == 2), total = n()) |>
filter(total > 1400 & yday > 3 & yday < 180) |>
ungroup() |>
mutate(active_prop = active_time/total)
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
active_time_day_id_nested <- active_time_day_id %>% group_by(ID) %>% nest()
active_time_day_id_nested$sex <- c("m", "m", "m", "f", "f", "f", "f", "m", "m", "f")
active_time_day_id_nested$age <- c(1, 10, 5, 1, 3, 2, 2, 3, 10, 8)
active_time_day_id_nested$origin <- c("o", "o", "c", "o", "o", "o", "o", "c", "c","o")
active_time_day_id <- active_time_day_id_nested %>% unnest(data)
# active_time_day_id |> filter(ID == "A12" & yday == 6) # checking for single id and day
As we were curious about how long each kākā is active for each day, we plotted the active time per day for a single individual across the study period. As you can see, it is remarkably consistent at roughly 8/16 hours inactive/active even as day lengths change significantly.
active_time_day_id |> filter(ID == "A12") |> ggplot() +
geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_smooth(aes(x = yday, y = inactive_time, colour = ID), fill = "orange", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = inactive_time, colour = ID), colour = "orange", alpha = 0.85) +
geom_smooth(aes(x = yday, y = active_time, colour = ID), fill = "skyblue", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = active_time, colour = ID), colour = "skyblue", alpha = 0.85) +
scale_colour_viridis_d() +
scale_x_continuous("Day since start of study", breaks = seq(0,180, 30)) +
scale_y_continuous("Number of minutes") +
ggtitle("Active time per day") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
# uncomment to save the plot
# ggsave(paste0("outputs/plots/active_time_id_12_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300, scale = 1)
As we can see, this is also consistent across individuals, although some individuals have quite different patterns.
active_time_day_id |> ggplot() +
geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_smooth(aes(x = yday, y = inactive_time, colour = ID), fill = "orange", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = inactive_time, colour = ID), alpha = 0.25) +
geom_smooth(aes(x = yday, y = active_time, colour = ID), fill = "skyblue", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = active_time, colour = ID), alpha = 0.25) +
scale_colour_viridis_d() +
scale_x_continuous("Day since start of study", breaks = seq(0,180, 30)) +
scale_y_continuous("Number of minutes") +
ggtitle("Active time per day per individual") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
# uncomment to save the plot
# ggsave(paste0("outputs/plots/active_time_all_ids_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
If we look at the plots for males and females separately, we can see that males are much more consistent across the study period. We suspect that this is due to nesting behaviour, as we can confidently say that at least one female - kākā ID 11, who appears green in the female-only plot, was nesting during the study period. This is also evidenced by the actogram above.
active_time_day_id |> filter(sex == "m") |> ggplot() +
geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_smooth(aes(x = yday, y = inactive_time, colour = ID), fill = "orange", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = inactive_time, colour = ID), alpha = 0.25) +
geom_smooth(aes(x = yday, y = active_time, colour = ID), fill = "skyblue", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = active_time, colour = ID), alpha = 0.25) +
scale_colour_viridis_d() +
scale_x_continuous("Day since start of study", breaks = seq(0,180, 30)) +
scale_y_continuous("Number of minutes") +
ggtitle("Active time per day per individual - males only (n = 5)") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
# uncomment to save the plot
# ggsave(paste0("outputs/plots/active_time_males_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
active_time_day_id |> filter(sex == "f") |> ggplot() +
geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_smooth(aes(x = yday, y = inactive_time, colour = ID), fill = "orange", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = inactive_time, colour = ID), alpha = 0.25) +
geom_smooth(aes(x = yday, y = active_time, colour = ID), fill = "skyblue", span = 0.5, alpha = 0.15, method = "loess", se = TRUE) +
geom_point(aes(x = yday, y = active_time, colour = ID), alpha = 0.25) +
scale_colour_viridis_d() +
scale_x_continuous("Day since start of study", breaks = seq(0,180, 30)) +
scale_y_continuous("Number of minutes") +
ggtitle("Active time per day per individual - females only (n = 5)") +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
# uncomment to save the plot
# ggsave(paste0("outputs/plots/active_time_females_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
When we look at the average active time per day per individual, we can see that it is consistent across ages at roughly 16 hours of active time (960 minutes - dashed line).
active_avg_age <- active_time_day_id |> ggplot() +
# geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_boxplot(aes(x = factor(age), y = active_time, group = ID, fill = age), alpha = 0.75) +
scale_y_continuous("Number of minutes", limits = c(0,1440)) +
scale_x_discrete("Age") +
ggtitle("Active time") +
scale_fill_viridis_c() +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
active_avg_age
The average daily active time also doesn’t have any correlation to sex (although there’s slightly more variance within each females daily active time). This is despite the nesting behaviour observed for females, which is an interesting result - even though there are many days that are below 16 hours for some individuals, it is recovered at other times during the study period and the overall average is still 16 hours.
active_avg_sex <- active_time_day_id |> ggplot() +
# geom_hline(yintercept = 480, linetype = "dashed") +
geom_hline(yintercept = 960, linetype = "dashed") +
geom_boxplot(aes(x = factor(sex), y = active_time, group = ID, fill = sex), alpha = 0.75) +
scale_y_continuous("Number of minutes", limits = c(0,1440)) +
scale_x_discrete("Sex") +
scale_fill_viridis_d() +
theme_classic() +
theme(legend.position = "none",
plot.title = element_text(size = 25), # Increase plot title size
axis.title = element_text(size = 20), # Increase axis title size
axis.text = element_text(size = 16), # Increase axis text size
legend.title = element_text(size = 20), # Increase legend title size
legend.text = element_text(size = 16) # Increase legend text size
)
active_avg_sex
ggarrange(active_avg_age, active_avg_sex + rremove("ylab"), ncol = 2, nrow = 1, align = "hv")
# uncomment to save the plot
# ggsave(paste0("outputs/plots/active_time_avg_age_sex_", Sys.Date(), ".png"), width=300, height=170, units="mm", dpi = 300)
We can relate activity behaviour to the spatial locations to assess where the behaviours take place. As the GPS devices were set to take locations every 3 hours (or 2 hours for one individual), we can just use the state labels from the ODBA-HMM analysis to match to the time of the GPS locations. Therefore, there’s no behavioural state identification of the GPS locations explicitly, but we can infer the behaviour from the ODBA-HMM state labels, which is much higher frequency than the GPS locations.
gps_files <- paste0("data/", list.files(path = "data", pattern = "speed")) # writes .csv file names to chr vector - 2021 is common across all the datasets of interest
gps_list <- map(gps_files, read.csv) # reads csv files
names(gps_list) <- as.character(unique(all_prepped_df$ID))
gps_all_df <- bind_rows(gps_list)
gps_all_nested <- gps_all_df %>% nest(data = -id)
gps_all_nested$ID <- as.character(unique(all_prepped_df$ID))
gps_all_df <- gps_all_nested |> unnest(cols = c(data)) |> mutate(X.1 = NULL, # remove a redundant index variable
DateTimeNZ = with_tz(DateTime, "Pacific/Auckland"),
DateTimeNZ_R = as.POSIXct(round.POSIXt(DateTimeNZ, units = "mins"))
)
# here we match the rounded times of the GPS locations and the time of the ODBA-HMM state labels
gps_all_with_states_df <- left_join(gps_all_df, all_prepped_df, by = join_by(DateTimeNZ_R == DateTimeNZ, ID == ID))
The amt package has simple functions for Kernel Desnity
Estimates (KDEs) from GPS locations, so we convert the GPS data frames
to the amt track object for each individual. We can then
use the amt function kde to create the KDEs
for each individual, which we can delineate by state.
As separating the GPS locations by state removes much of the temporal
autocorrelation between locations, we calculate KDEs rather than a
method that considers autocorrelation, such as Autocorrelated Kernel
Density Estimation (AKDE) (although the amt package does
have functions for calculating AKDEs).
We also create a template raster with a consistent resolution and extent for the KDEs.
gps_all_nested_tracks <- gps_all_with_states_df %>% nest(data = -id) %>%
mutate(trk = map(data, function(d) {
make_track(d, lon, lat, DateTimeNZ, crs = 4326, all_cols = TRUE) %>%
transform_coords(2193)
}))
# unnest to find the min and max values for the extent of the tracks
extent <- gps_all_nested_tracks %>% unnest(trk) %>%
summarise(xmin = min(x_), xmax = max(x_), ymin = min(y_), ymax = max(y_))
# create a template raster for the KDEs
template_raster <- rast(xmin = extent$xmin, xmax = extent$xmax, ymin = extent$ymin, ymax = extent$ymax, res = 50, crs = crs("epsg:2193"))
In this loop we calculate the KDEs for each state for individual, and then plot the KDEs for each state overlaid. We also calculate Bhattacharrya’s affinity (BA) for each KDE, which is a measure of the similarity between two distributions. We can use this to assess how similar the KDEs are between states. BA considers the density at each point in the KDE, rather than purely spatial overlap, and is therefore a robust measure of similarity which considers the intensity of space use.
track_sf_points_list <- vector(mode = "list", length = length(gps_all_nested_tracks$trk))
for(i in 1:length(gps_all_nested_tracks$trk)) {
# KDE for all the locations at once
# kde_full <- hr_kde(gps_all_nested_tracks$trk[[i]], trast = template_raster, levels = c(0.5, 0.75, 0.95))
# kde_full_isopleths <- hr_isopleths(kde_full, levels = c(0.5, 0.75, 0.95))
# KDE for inactive locations only
kde_inactive <- hr_kde(gps_all_nested_tracks$trk[[i]] |> filter(states == 1), trast = template_raster, levels = c(0.5, 0.75, 0.95))
kde_inactive_isopleths <- hr_isopleths(kde_inactive, levels = c(0.5, 0.75, 0.95))
# KDE for active locations only
kde_active <- hr_kde(gps_all_nested_tracks$trk[[i]] |> filter(states == 2), trast = template_raster, levels = c(0.5, 0.75, 0.95))
kde_active_isopleths <- hr_isopleths(kde_active, levels = c(0.5, 0.75, 0.95))
# calculate the overlap between active and inactive UDs
# print(hr_overlap(kde_inactive, kde_active, type = "ba"))
# convert the track to an sf object for plotting the points as well
track_sf_points_list[[i]] <- as_sf_points(gps_all_nested_tracks$trk[[i]])
print(ggplot() +
geom_sf(data = kde_active_isopleths, fill = "skyblue", alpha = 0.25) +
geom_sf(data = kde_inactive_isopleths, fill = "orange", alpha = 0.25) +
geom_sf(data = track_sf_points_list[[i]] |> filter(states == 2), colour = "skyblue", size = 1, alpha = 0.5) +
geom_sf(data = track_sf_points_list[[i]] |> filter(states == 1), colour = "orange", size = 1, alpha = 0.5) +
ggspatial::annotation_scale(location = "bl", width_hint = 0.25, text_col = "black") +
ggspatial::annotation_north_arrow(location = "tr") +
scale_x_continuous("Easting") +
scale_y_continuous("Northing") +
ggtitle(paste0("Kākā ID: ", tags[i]),
subtitle = paste0("BA overlap = ", round(hr_overlap(kde_inactive, kde_active, type = "ba")$overlap, digits = 3))) +
theme_bw() +
theme(plot.title = element_text(size = 25), # Increase plot title size
plot.subtitle = element_text(size = 18), # Increase plot subtitle size
axis.title = element_text(size = 20)))
# uncomment to save the plot
# ggsave(paste0("outputs/plots/kde_overlap_id_", tags[i], "_", Sys.Date(), ".png"), width=150, height=170, units="mm", dpi = 300)
}